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Initi.il  Value  Problems:  Practical  Theoretical  Developments 

C.  'J.  Gear 
Department  of  Computer  Science 
University  of  Illinois  at  Urbana-Champaign 
Urbana,  IL  61P01,  USA 

Ah st  rar  t 

T^>  i  s  paper  surveys  several  recent  developments  in  the  theory  of  multistep 
initial  value  solvers  that  are  useful  for  the  implementation  of  practical 
codes.   These  developments  are  concerned  with  stability  of  variable 
step/order  methods,  starting  at  a  high  order  (and  hence,  with  a  large 
step),  and  the  intepration  of  problems  with  rapidly  oscillating  solutions. 

Int  roduc  t  ion 

The  worV  surveyed  here  is  aimed  at  improving  the  efficiency  of  multistep, 
variable  order,  variable  step  codes  for  initial  value  problems.   A  code 
will  be  efficient  if  it  can  use  a  large  step  size,  and  can  rapidly  increase 
the  step  when  a  change  in  the  nature  of  the  problem  warrants  it.   One  of 
the  limits  to  step  size  changing  is  the  problem  of  stability,  so  it  is 
important  to  use  as  weak  a  restriction  on  step  size  as  possible  consistent 
with  a  guarantee  of  stability.   Further,  the  restriction  should  be  easy  to 
irplemnnt  so  that  it  does  not  contribute  to  the  overhead.   Recent  results 
have  shown  that  a  very  simple  restriction  that  bounds  the  ratio  of  adjacent 
steps  above  and  below  by  constants  specific  to  the  class  of  methods  used  is 
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sufficient  to  guarantee  stability. 

Comparisons  of  codes  by  numerous  workers,  including  Hull  et  al  [8]  have 
shown  that  variable-order,  multistep  methods  are  among  the  most  competitive 
for  initial  value  problems  except  in  some  circumstances.   The  exceptions 
occur  when  a  function  evaluation  is  extremely  inexpensive,  in  which  case 
the  increased  overhead  of  multistep  methods  makes  them  less  attractive  than 
Runge-Kutta  codes,  or  when  a  problem  has  a  large  number  of  discontinuities 
which  cause  relatively  expensive  restarts.   A  technique  has  been  developed 
which  allows  a  multistep  method  to  start  (or  restart)  at  a  high  order  and 
large  step  size,  and  yet  uses  relatively  few  function  evaluations.   It  is  a 
Runge-Kutta-like  technique,  and  will,  for  example,  let  a  method  start  at 
fourth  or  fifth  order  after  only  6  function  evaluations.   It  also  provides 
information  that  allows  a  reasonable  estimate  of  the  initial  step  to  be 
made.   Preliminary  tests  indicate  that  it  is  efficient  and  allows  multistep 
methods  to  be  used  in  several  cases  where  previously  they  were  not 
competitive.   The  starting  technique  has  the  additional  advantage  that  it 
can  be  inserted  into  many  of  the  popular  existing  codes  with  very  little 
difficulty,  as  it  can  be  used  to  calculate  whatever  type  of  information  is 
stored  as  the  history  of  the  problem,  whether  past  function  values,  past 
derivative  values,  divided  or  backward  differences,  or  high-order 
derivatives. 
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There  are  still  sone  problems  whose  numerical  computation  is  so  difficult 
as  to  be  effectively  impossible.   Among  this  class  of  problems  were  the 
problems  with  highly  oscillatory  solutions.   Some  authors  [11]  have  termed 
these  "stiff  oscillatory"  problems  because  the  difficulty  can  occur  when 
there  are  imaginary  eigenvalues  large  compared  to  the  interval  of 
integration.   However,  I  think  this  is  an  unfortunate  misnomer  as  the  basic 
difficulty  in  the  conventional  stiff  problem  is  to  ignore  the  effect  of 
components  not  present  in  the  solution.   In  my  dictionary,  a  problem  is 
stiff  only  when  an  eigenvalue  has  a  large  negative  real  part  relative  to 
the  activity  in  the  solution.   If  we  have  a  problem  with  a  large  imaginary 
eigenvalue,  and  the  solution  contains  no  component  corresponding  to  that 
eigenvalue,  then  we  might  consider  it  as  a  stiff  problem,  and  some  of  the 
techniques  for  stiff  problems  will  work  (but  often  not  the  backward 
differentiation  formulae  in  DIFSUB).   However,  if  the  eigenvalue  is  almost 
pure  imaginary  so  that  the  component  has  no  damping,  the  probability  is 
that  the  oscillating  component  will  be  present  in  the  solution.   If  the 
problem  is  linear,  it  does  no  harm  to  ignore  that  component,  but  in  the 
class  of  the  much  more  interesting  non-linear  problems,  the  behaviour  of 
the  oscillation  must  be  considered.   Thus,  even  if  the  oscillation  is  due 
to  an  imaginary  eigenvalue,  we  need  methods  to  take  account  of  the 
oscillation  in  general.   Furthermore,  many  oscillations  are  non-linear  in 
nature,  for  example  the  relaxation  oscillation  of  the  van  der  Pol  equation 


-  A  - 


2 
+  k(y   -  l)y   +  y  -  0,   k  >  0 


in  which  the  eigenvalues  switch  rapidly  from  the  negative  to  the  positive 
half  plane.   The  computational  objective  may  be  one  of  several:  for 
example,  to  determine  the  nature  of  the  oscillation  after  an  arbitrarily 
long  time  (sometimes  called  the  "periodic  steady  state  problem"),  to  find 
the  phase  of  the  oscillation  after  a  long  time,  to  determine  the  amplitude 
of  the  oscillation,  or  to  determine  the  behaviour  of  the  underlying  "non- 
oscillatory"  components.   Only  in  the  latter  case  are  we  interested  in 
ignoring  the  oscillatory  components  altogether,  and  it  appears  that  even 
then,  their  effect  must  be  considered  if  the  problem  is  non  linear.   The 
PhD  thesis  of  a  recent  Illinois  graduate,  L.  Petzold  [12],  gives  extensions 
of  some  methods  developed  by  various  workers,  including  Graff  and  Bettis 
[6],  Mace  and  Thomas  [10],  and  others,  and  studies  their  accuracy.   These 
methods  are  relatively  easy  to  embed  in  standard  codes  such  as  DIFSUB. 

1      Stability 

It  has  been  known  for  some  time  that  arbitrary  step  changes  and/or  order 
changes  can  make  some  multistep  methods  unstable.   One  of  the  first 
positive  results  in  this  direction  was  due  to  Piotrowski  [13]  who  showed 
that  Adams  methods  are  unconditionally  stable  if  Implemented  in  a  variable 
step  form.   Later,  Tu  [14]  showed  that  the  conditions  on  programs  using 
interpolation  techniques  for  changing  the  step  size  have  to  be  more  strict, 
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at  least  for  Admas  methods,  as  there  exist  sequences  of  steps  for  which  the 

three-step  Adams  Bashforth  Moulton  method  is  unstable.   Jackson  and  Skeel 

[9]  Investigated  these  methods  thoroughly  and  showed  that,  for  any  strongly 

stable  nultlstep  method  using  the  interpolation  scheme  for  step  changing, 

there  exist  constants  0  <  a  £  b  <  1  <B  such  that  the  program  is  stable  if 

the  ratio  of  adjacent  step  sizes  h   .  and  h   is  such  that  either 

n-1      n 


b  <  h  /h   .  <  B 
n   n-1 


or 


0  <  h  /h   .  <  a 
n   n-1 


This  means  that  either  the  rate  of  change  of  step  size  is  restricted,  or 
the  step  is  reduced  radically.   This  is  a  very  practical  restriction  as  it 
Is  easy  and  inexpensive  to  implement,  and  it  allows  a  very  small  step  to  be 
used  when  a  sharp  reduction  is  necessary  because  of  a  sudden  change  in  the 
behavior  of  the  solution.   All  that  is  necessary  is  to  check  the  step  ratio 
recommended  by  your  favorite  error  control  scheme,  and  reduce  it  to  B  if  it 
is  larger  than  B,  or  to  reduce  it  to  a  if  it  is  between  a  and  b.   More 
recently,  Gear  [4]  has  shown  that  the  same  condition  is  sufficient  for 
multlstep  methods  to  be  stable.   While  the  underlying  reason  is,  in  one 
sense,  quite  technical,  it  admits  a  very  simple  intuitive  explanation  which 
is  given  in  [4]. 
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2_     Starting  multlstep  methods. 

Most  current  multlstep  codes  are  variable  order,  and  this  feature  allows 
them  to  start  at  a  low  order  corresponding  to  a  one-step  method.   This  Is 
typically  first  or  second  order,  and  implies  a  fairly  small  time  step  for 
modest  accuracy.   Consequently,  the  initial  steps  are  fairly  expensive  in 
that  they  use  a  large  number  of  function  evaluations  before  the  order 
reaches  a  level  at  which  the  step  is  close  to  maximum.   This  is  not  a 
serious  drawback  when  a  problem  has  to  be  integrated  over  a  considerable 
range,  since  the  number  of  function  evaluations  in  the  start-up  is  far  less 
than  the  number  used  over  the  rest  of  the  range.   However,  if  the  problem 
has  frequent  discontinuities,  a  multlstep  method  must  restart  after  every 
discontinuity.   In  this  case,  a  Runge-Kutta  method  is  frequently  preferable 
because,  as  long  as  the  discontinuities  are  at  mesh  points,  a  one-step 
method  is  unaffected  by  discontinuities  in  derivatives.   A  recently 
developed  technique  given  in  Gear  [5]  overcomes  this  problem  by  using  a 
Runge-Kutta  like  starter  to  generate  values  for  a  multlstep  method.   To 
illustrate  the  idea,  let  us  consider  the  question  of  finding  a  set  of 
values  for  starting  a  4-th  or  5-th  order  in  which  the  predictor  is  fourth 
order.   This  requires  a  4-step  method  if  we  want  strong  stability.   At  the 
start,  or  after  a  discontinuity,  we  have  a  single  starting  value,  so  the 
problem  is  to  generate  three  additional  values  with  appropriate  accuracy. 
Since  we  are  going  to  use  a  variable  order  method  subsequently,  it  does  not 
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make  much  sense  to  talk  about  order  as  a  measure  of  accuracy,  but 
unfortunately  nobody  has  offered  an  alternative  (and  many  of  us  have 
pondered  this  problem  for  some  time).   Therefore,  we  will  ask  that  the 
three  additional  values  be  generated  to  fourth  order  accuracy.   (Why  not 
fifth  order?  —  since  restarting  occurs  a  fixed  number  of  t  Lmes,  the 
contribution  of  a  fourth  order  error  at  each  restart  is  a  global 
contribution  of  fourth  order,  so  four  seems  a  reasonable  choice.)  Given  the 
equation  y'  -  f(y,t)  and  a  single  value  of  the  pair  (y,t),  it  is  clear  that 
the  first  thing  we  do  has  to  be  an  evaluation  of  f(y,t).   Consequently,  if 
we  somehow  calculate  y  to  fourth  order  accuracy  at  three  additional  points, 
we  will  have  five  fourth-order  accurate  values  available,  y  and  y'  at  the 
initial  point  and  three  additional  values  of  y.   These  uniquely  define  a 
fourth-order  polynomial  taking  the  same  values.   Therefore,  the  problem  is 
to  determine  the  coefficients  of  this  polynomial,  or  any  set  of  five  values 
that  uniquely  determine  them.   The  approach  taken  in  [5]  is  to  ask  about 
the  existence  of  Runge-Kutta-like  processes  that  will  generate  these  values 
to  fourth-order  accuracy.   To  be  precise,  we  ask  if  it  is  possible,  using  a 
q-stage  explicit  Runge-Kutta  process,  to  calculate  the  values  of  the  scaled 
derivatives 


.       '  ,2  (2)        .p  (p)  . 
(y0»  hyQ,h  yQ   ,  ...  ,  hKy0P   ] 


to  p-th  order  accuracy.   The  process  used  is 
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i-1 
k  -hf(y  +  IB  k  )   i  -  l,...,q 
j-1  J  J 

.  s  (s)  2-         ,        . 

h  y   -  IYfl,k    s  -  l,...,p 

j-1  SJ  J 
Although  this  process  ressembles  the  Runge-Kutta-type  methods  that  generate 
several  approximations  of  different  orders  (see,  for  example,  Bettis  [1], 
and  Verner  [15]),  it  is  not  the  same,  as  this  is  equivalent  to  generating 
several  output  values  at  the  same  order.   Reference  [5]  shows  that  the 
minimum  value  of  q  for  p  ■  1  to  4  is 


p 

minimum  q 

1 

1 

2 

2 

3 

4 

4 

6 

The  minimum  value  of  q  for  other  p  is  not  known,  although  it  has  been  shown 
that  the  minimum  is  no  larger  than  1  +  p(p-l)/2.   Of  course,  the 
presentation  of  a  set  of  formula  does  not  necessarily  lead  to  a  practical 
algorithm;  there  is  still  the  very  important  question  of  step  selection, 
and  this  is  now  a  double  problem:  what  step  to  use  in  the  Runge-Kutta 
process,  and  what  step  to  use  in  the  first  multistep  step.   Note  that  the 
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Runge-Kutta  process  does  not  actually  take  a  step  as  we  have  presented  it. 
If  its  output  is  used  to  start  a  code  that  uses  a  Nordsieck  vector,  this 
remains  true.   If,  on  the  other  hand,  we  use  it  with  a  code  that  retains 
past  function  or  derivative  values,  we  must  replace  the  linear  combinations 
of  the  k's  with  other  linear  combinations  that  give  the  required  values  of 
y  or  hy'  at  a  set  of  points.   We  could  choose  either  points  forward  of  the 
starting  point,  or  points  behind  the  starting  point.   My  preference  is  the 
latter.   Then,  we  do  not  have  to  be  concerned  about  the  Runge-Kutta 
starting  process  introducing  error  directly  into  the  solution,  but  only 
with  the  propogation  of  errors  it  introduces  in  the  additional  values.   The 
step  size  for  the  Runge-Kutta  process  must  be  chosen  small  enough  that  this 
propogated  error  is  reasonable,  but  large  enough  so  that  roundoff  errors  do 
not  overwhelm  other  errors.   If  we  use  a  fourth  order  process,  for  example, 
then  we  can  determine  if  the  step  size  is  within  these  (hopefully  generous) 
bounds  by  examining  the  values  of  the  scaled  derivatives  calculated.   While 
it  is  true  that  the  error  is  an  unknown  fifth-order  term,  we  can  use  the 
popular  approach  of  controlling  the  size  of  the  fourth-order  terms,  coupled 
either  with  a  small  prayer  or  a  generous  safety  factor,  determined  by  your 
philosophical  predilections.   At  first  sight  it  seems  that  the  scaled 
fourth  derivative  times  an  error  coefficient  should  be  less  than  the  error 
tolerance  e  (scaled  by  the  function  value  for  relative  errors).   However, 
this  is  really  the  error  control  we  would  apply  if  we  were  using  the  RK 
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information  to  take  a  step.   Suppose  the  h  we  use  in  the  RK  step  is  larger 
than  this.   Clearly  we  will  reduce  the  step  in  the  multistep  method  so  that 
its  error  is  within  tolerance.   When  we  do  this,  we  will  rescale  the  scaled 
derivatives  calculated  in  the  starter  by  powers  of  the  step  size  ratio. 
The  second  scaled  derivative  is  the  first  one  we  calculate  that  has  errors 
(the  first  derivative  has  no  truncation  error).   This  is  scaled  by  the 
square  of  the  step  ratio.   After  the  scaling,  we  would  like  the  fifth-order 
error  present  in  that  term  to  be  less  than  e.   However,  we  agreed  to 
estimate  the  error  by  looking  at  the  fourth  order  derivative.   Bringing 
this  together,  we  get: 


Original  scaled  second  derivative  is 


,2  (2) 

h  y    +  error 


Assume  that  the  error  is  given  by 


ChV4> 


2 
Error  in  second  derivative  after  scaling  by  (h   /h)   is 

new 


n-ch2y(4)(h   )2 

J  new 


If  the  error  control  in  the  first  step  taken  is  such  that 
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4  nevr 


we  can  calculate  h    ,  Insert  it  in  the  previous  expression  to  net 

new'  r       .  r  b 


E  -  Ch2y(4) 


V 


(4) 


1/2 


If  we  want  to  control  E  so  that 


KE  <  e 


we  find  that 


C   e 
u4  (4)    4 
h  y 


—  2  2 
K  C 

This  is  the  test  that  must  be  applied  to  see  whether  to  reject  the  Runge- 

Kutta  starter.   (We  have  assumed  that  the  error  in  the  rescaled  third 

scaled  derivative  will  be  smaller  and  can  be  ignored.)  The  fourth  scaled 

derivative  should  also  be  about  an  order  of  magnitude  larger  than  u,  the 

unit  round-off  error  (also  scaled  by  the  function  value  if  necessary). 

Thus  we  see  that  the  test  of  acceptability  for  the  Runge-Kutta  starter  is 

simply  a  check  on  the  size  of  the  fourth  derivative  (or  p-th  derivative,  in 

general.   However,  we  have  to  be  beware  of  the  possibility  of  zero  p-th 
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order  derivatives.   These  are  not  so  uncommon  at  initial  values,  so  it  is 
important  to  consider  the  value  of  all  derivatives  up  to  the  p-th  in 
estimating  the  error.   Our  approach  is  to  assume  that  the  behaviour  of  the 
q-th  derivative  is 


lly(q)H  -  Aq||y|| 

(1) 

Using  this,  we  calculate 


llyll     "  [pjtiL"yll    J     J 


(2) 

as  an  approximation  to  the  p-th  derivative  relative  to  y.   (If  ||y||  is 
less  than  1,  we  replace  ||y||  with  1.) 

Once  the  Runge-Kutta  starter  has  been  performed  with  a  satisfactory  step  we 
can  choose  a  step  for  the  first  application  of  a  multistep  method.   Once 
again,  we  use  the  estimated  p-th  derivative  to  estimate  the  error  in  a 
(p-l)st  order  method,  even  though  we  may  use  a  p-th  or  (p+l)st  order 
method.   If  the  error  in  the  (p-l)st  order  version  of  the  method  used  is 


C  hPy(p) 
P 


we  choose  an  initial  h  based  on 
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new 


Cp||y(p)M 


1/p 


We   have    also    tried    using 


new 


e     llvll 


Cq||y(q)|| 


1/q 


with  q  ■  p+1  and  p+2,  where  our  estimate  of  the  q-th  derivatives  of  y  is 
based  on  the  approximation  in  eq.  (1)  and  the  estimate  in  eq.  (2).  This 
yields    the   estimate 


ly 


(q) 


ly 


>1 


h. 


(J) 


ly 


i/j 


All  this  does  is  to  change  the  dependence  of  the  initial  step  on  e  and  to 
change  the  coefficient  used. 

2. 1  Detecting  Discontinuities 


If  an  automatic  step  control  integrator  is  given  no  further  information 
about  discontinuities,  the  step  control  mechanism  can  be  very  inefficient 
in  the  neighborhood  of  a  discontinuity.   This  occurs  because  an  attempted 
step  that  straddles  a  discontinuity  yields  a  very  large  error  estimate, 
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usually  causing  the  code  to  reject  the  step  and  reduce  the  stepslze  sharply 
so  that  the  step  not  only  no  longer  straddles  the  discontinuity  but  often 
so  that  it  falls  far  short  of  the  discontinuity.   Consequently,  several 
small  steps  are  taken  until  the  step  is  once  again  increased  to  a 
reasonable  size  so  that  an  attempted  step  again  straddles  the 
discontinuity.   This  process  is  repeated  with  a  large  cost  in  function 
evaluations.   Codes  have  used  a  number  of  techniques  to  reduce  the  problem. 
DIFSUB,  an  early  code,  reduces  the  order  to  one  and  restarts  when  this  type 
of  difficulty  is  first  encountered,  and  then  quits  if  it  occurs  three  times 
in  succession—a  simple  but  not  too  useful  solution.   Some  codes  resort  to 
binary  division  of  the  interval  when  the  error  tests  are  inconsistent. 
(The  error  has  a  known  asymptotic  behavior  if  no  discontinuities  are 
present;  it  is  possible  to  check  for  gross  deviation  from  such  behavior  as 
an  indication  of  discontinuities.)  Binary  division  may  be  as  good  as  any 
strategy  if  no  further  information  about  the  discontinuity  is  available. 
However,  if  we  know  exactly  when  each  discontinuity  occurs,  it  is  far  more 
efficient  to  integrate  exactly  to  the  discontinuity  and  then  restart. 

Discontinuities  can  arise  from  two  sources:   the  driving  (t-dependent) 
terms  and  the  dependent  variables.   An  example  of  a  driving  term 
discontinuity  occurs  in  smog  simulation.   The  sun's  energy  input  is  a 
time-dependent  term  that  has  two  discontinuities  each  24  hours  as  the  sun 
rises  and  sets.   Dependent  variable  discontinuities  frequently  arise  in 
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simulation.   For  example,  a  relay  leads  to  equations  of  the  form 


if  |IR|  >  I   ,  .      ,  then  switch  ♦  on 
critical 

if  |IR|  <  I  .   then  switch  ♦  off 

min 

if  switch  ■  on  then  V  ■  0 
else  1*0 


where  IR  is  the  current  in  the  relay  coil,  and  V  and  I  are  the  voltages 

across  and  the  current  through  the  relay  contracts.   This  means  that  as  IR 

(which  is  a  function  of  the  dependent  variables)  increases  to  the  point  it 

passes  I   ,  ,      ,,  the  relay  turn  on  current,  the  coefficients  in  the 
r  critical 

differential  equations  change  discontinuously.   A  similar  problem  arises  in 
some  mathematical  models  that,  although  infinitely  dif ferentiable,  exhibit 
such  sudden  changes  they  are  indistinguishable  from  discontinuities.   For 
example,  a  diode  is  sometimes  modeled  by  an  exponential  function  such  as 


t     ,    40v    T> 
I  *  c(e    -  I) 


where  c  is  the  reverse  leakage  current.   The  constant  40,  or  some  large 
number,  causes  a  behavior  similar  to  that  shown  in  Figure  1.   In  such  cases 
either  the  model  should  be  changed  to  a  discontinuous  model  (e.g.,  a 
piecewise  linear  function)  or  detection  schemes  which  indicate  when  the 
nature  of  the  model  changes  radically  should  be  added. 
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The  basic  problem  is  to  detect  when  a  discontinuity  occurs  in  time  so  that 

when  a  step  is  about  to  straddle  it,  the  stepslze  can  be  reduced.   This  is 

trivial  for  time-dependent  discontinuities,  so  let  us  consider  the  problem 

of  functions  of  dependent  variables.   Let  us  suppose  that  a  discontinuity 

occurs  when  an  expression  e(y)  crosses  zero.   Since  y  is  a  function  t,  we 

can  evaluate  e(y(t))  at  each  integration  step.   We  could  use  a  set  of  past 

values  e   . ,  i  ■  0, . . . ,  k  from  a  multistep  method  to  estimate  the  time  at 
n-i 

which  the  next  zero  crossing  will  occur  and  reduce  the  step  of  that  time  is 
within  the  next  interval.   However,  that  is  time-consuming  because  of  an 
inverse  interpolation  in  each  interval  and  error  prone  because 
extrapolation  gives  larger  errors  than  interpolation.   Therefore,  we  would 
like  to  be  able  to  take  the  next  step  and  then  interpolate.  This  is 
possible  only  if  there  is  no  discontinuity  in  the  tep.   The  technique 
employed  by  many  people  (see  Carver  [2]  is  to  delay  changing  the  "switch" 
until  the  completion  of  a  step.   Thus,  if  e(y)  is  an  expression  such  that  a 
swtich  is  to  change  when  e(y)  changes  from  negative  to  positive,  e(y)  is 
evaluated  only  at  the  end  of  each  step.   If  e(y)  remains  negative,  the 
integration  proceeds  normally.   However,  if  the  value  of  e(y)  is  found  to 
be  positive  at  the  end  of  a  step,  it  indicates  a  discontinuity  somewhere  in 
that  step.   Inverse  interpolation  can  then  be  employed  to  determine  the  t 
value  at  which  e(y)  is  zero.   Interpolation  can  then  be  employed  to 
calculate  the  dependent  variables,  although  it  is  probably  better  to 
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integrate  from  the  last  value  of  t,  and  posible  iterate  this  procedure  one 

time. 

3^     Oscillating  problems 

At  the  outset,  we  must  distinguish  between  two  types  of  problems;  those 
whose  oscillation  is  due  to  a  forcing  term  (time  dependent  term),  and  those 
whose  oscillation  is  "internal",  that  Is,  due  to  a  linear  or  nonlinear 
oscillator  in  the  equations.   Initially  we  will  discuss  the  latter,  and 
later  indicate  extensions  to  the  former  case.   We  expect  the  solution  to 
have  the  behaviour  sketched  in  Figure  2,  and  our  goal  is  to  determine  the 
nature  of  the  solution  after  a  very  long  time;  that  is,  after  a  large 
number  of  cycles.   First,  we  must  be  more  precise  about  what  we  want  to 
know  about  the  solution.   Chances  are  that  we  either  want  to  know  the  form 
of  the  oscillation,  or  the  "envelope"  of  the  solution;  that  is,  the  curves 
zu  and  zT  in  Figure  2. 

Unfortunately,  there  Is  no  such  thing  as  an  envelope  of  a  single  curve,  and 
the  solution  of  an  ODE  with  an  initial  value  is  a  single  curve.   However, 
it  is  quite  obvious  to  the  casual  observer  what  we  mean  by  the  envelope  In 
the  case  of  a  very  rapidly  oscillating  solution.   Our  objective  was  to 
develop  a  method  which  would  allow  us  to  calculate  a  smooth  curve  that  was 
the  "envelope"  so  that  we  could  use  large  step  sizes.   Therefore,  we 
defined  a  (hopefully)  smooth  curve,  called  a  quasi  envelope,  and  have 
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derived  approximate  differential  equations  for  it.   The  essence  of  the 
method  is  to  integrate  this  approximate  differential  equation  using  large 
time  steps. 

The  quasi  envelope  is  illustrated  in  Figure  3.   Suppose  the  "period"  of  the 
oscillation  is  T.   (We  will  discuss  how  this  is  defined  later,  and  indicate 
how  a  "varying  period"  is  handled.   For  the  moment  assume  that  we  have  been 
given  the  value  of  T.)  If  the  initial  value  of  y  at  t_  is  yn,  we  choose  a 
function  z(t)  for  tn  _<  t  _<  t,.  +  T  such  that  z  and  y  agree  at  tn  and  t_  +  T. 
Between  these  values  z(t)  should  be  smooth.   Now  we  extend  z(t)  to  all 
1  !   tn  +  T  by  defining  z(t  +  T)  to  be  the  value  of  the  solution  at  t  +  T  of 
y'  =  f(y,t)  with  initial  values  t,  z(t).   Hence,  z(t)  is  a  collection  of 
points  such  that  if  y  and  z  agree  at  time  t,  they  agree  at  times  t  +  kT  for 
all  integer  k.   The  smoothness  of  z(t)  depends  in  a  complex  way  on  the 
smoothness  of  the  initial  section  in  [0,T],  but  this  is  only  an  impediment 
to  the  theoretical  treatment  of  truncation,  not  to  the  practical  algorithm. 

Formally,  we  define  the  quasi  envelope  by 


z(t  +  T)  «  z(t)  +  Tg(z,t) 

(3) 

where 
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g(z,t)  -  hy(t   +  T,  t)  -  y(t,  t)] 

(4) 


and  y(t,  t )  is  the  solution  of 


^T  y(t,  t)  -  f(y(t,T),  t),  y(T,T)  -  z(t) 


An  example  of  y  is  the  dashed  curve  in  Figure  3.   It  "moves  the  solution 
forward  one  cycle."  At  first  sight,  many  find  the  quasi  envelope  confusing. 
"What  if  we  choose  an  initial  z(t),  0  _<  t  <T  such  that  we  are  in  the 
riddle  rather  than  on  the  peaks?   Then  we  will  have  no  idea  of  the 
amplitude."  It  must  he  remembered  that  Figure  3  is  confusing  as  it  shows 
only  one  dimension  of  at  least  a  two-dimensional  space  of  solutions  y.  (An 
autonomous  problem  must  be  multidimensional  to  oscillate.)  Hence,  the  quasi 
envelope  is  "on  the  side"  of  the  oscillation.   If  the  solution  is  truly 
periodic,  it  is  a  fixed  point  in  phase  space. 

Notice  that  g(z,t)  in  eq.  (4)  is  a  function  that  can  be  calculated  by  an 
integration  of  the  original  ODE  over  one  period.   Ue  refer  to  this  as  the 
"inner  integration."  Now  we  want  to  find  a  technique  to  compute  z  from  eq. 
(3)  given  g.   The  technique  is  essentially  that  developed  by  other  authors 
(see  [6],  [10],  for  example).  The  objective  is  to  evaluate  g  periodically, 
say  at  points  t   «  NH  where  II  >>  T  and  use  this  information  to  approximate 
z.   The  standard  formulas  are  most  easily  developed  using  operator 
calculus.   Let  D  be  the  differentiation  operator,  and  let 
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Vv(t)  -  v(t)  -  v(t-H).   Let  z  be  the  computed  approximation  to  z(t„) 


From  eq.  (3)  we  have 


Hence 


TD  _ 
e   -I 


-1 


(5) 


ZN  "  ZN-1  =  V 


TD 


f- 


exp[-£log(I-V)  -  I 


=  H{[I  -  |v  -  i^2  -  ...  ] 


W         1 2  |H  |  V   +  "• 


(6) 


The  first  term  in  square  brackets  corresponds  to  the  Adams  Moulton  formula 
The  remaining  terms  ar  correction  terms  which  are  small  if  T/H  is  small. 
Graff  [7]  has  called  these  "generalized  Adams"  methods.   The  above  is  an 
implicit  method  because  g  depends  on  z...   We  could  equally  well  have 
arrived  at  an  implicit  generalization  of  Adams  Bashforth  methods.   The  two 
together  give  a  standard  predictor/corrector  pair  which  can  be  employed  in 
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the  usual  way.   Alternatively,  we  can  use  eq.  (5)  to  express  g„  in  terras  of 

z.,  and  Its  backward  differences  to  get  an  implicit  generalized  backward 
N 

differentiation  formula. 


When  we  use  a  formula  such  as  that  embodied  in  eq.  (6)  we  discard  high 
powers  of  V.   This  leaves  a  multistep  formula  of  the  form 


1.0Iai(r)zN-i  +  wi<r>W 


where  r  ■  T/II  and  a   and  6  .  are  polynomials  of  degree  k  in  r. 

A  practical  code  must  vary  its  stepsize  H  (and  the  order).   At  first  sight 
this  looks  to  be  a  complex  mess.   However,  Petzold  [12]  has  shown  that 
interpolatory  stepsize  changing  is  easy  if  an  analogue  of  the  Nordsieck 
vector  is  used  to  represent  the  history  of  the  solution.   Instead  of 
storing  [z,  Hz',  H2z(2)/2,  ....  Hkz(k)/k!]*  she  stores 

%  "  tzN*  "Vn,  "Vn72'  •—  ^V^"1^"* 
where 


VTz(t)  -  !"[*(t-ffl)  -  z(t)] 


When  this  form  is  used,  the  predictor  step  takes  the  form 
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%,(0)  "  A(r)%-1 


where  A(r)  is  the  Pascal  triangle  matrix  in  all  but  the  first  row  where  it 
is  a  polynomial  in  r  ■  T/H.   The  corrector  takes  the  form 


%,(tH-l)  -%,(*>  +^(r)F(^I,(m)) 


where  Y_  is  independent  of  r  in  all  but  its  first  component.   Those  of  you 
familiar  with  the  Nordsieck  scheme  (see  [3])  will  recognize  that  this  is  a 
minor  change  to  that  scheme.   In  fact,  the  stepand  order  changing 
algorithms  are  identical,  so  that  only  minor  changes  were  needed  to  DIFSUB 
(see  [3],  chapter  9)  to  adapt  it. 

Variable  Period 

A  variable  period  T(t)  is  handled  by  a  simple  change  of  independent 
variable.   Assume  we  know  T(t).   We  introduce  a  new  independent  variable  s 
such  that 

a(t+T(t))  -  s(t)  «  t,   s(0)  =  0 

where  t  is  the  constant  period  in  the  new  variable.   However,  what  we  need 
to  calculate  is  t(s)  so  that  we  can  evaluate  g(z  (t (s) ) , t  (s) ).   Ue  have 
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t(s+x)    -   t(s)    -  T(t(s)),         t(0)    -   0 
Rewriting    this    and   eq .     (3)    we   get 


z(t(s+T))    -   z(t(s))   +Tp^(z,t(s)) 


t(s+T  )     =     t(s)     +    T 


pisii] 


The  variable  t  is  handled  as  an  additional  dependent  variable.   We  solve 
both  of  these  equations  using  the  constant  period  formulas  given  above  with 
period  t,  where  g  is  replaced  by  T(t (s)  )g(z ,t (s) )  A  when  computing  z,  and 
by  T(t(s))A  when  computing  t. 

Period  Determination 

The  period  is  not  defined  for  a  non-periodic  function,  but  we  should  notice 
that  the  preceding  discussion  is  equally  applicable  to  any  function  and  any 
T(t),  whether  periodic,  "nearly  periodic"  (whatever  that  means),  or 
arbitrary.   If  y(t)  is  highly  oscillatory  and  T(t)  is  in  no  sense  a 
"period,"  the  function  z(t)  will  have  components  very  similar  to  the  high 
frequency  components  of  y,  so  H  must  be  small  for  accuracy  and  nothing  is 
gained.   However,  if  we  can  find  a  T(t)  such  that  z(t)  is  slowly  varying, 
we  can  hope  to  use  a  large  H. 

Let  us  suppose  that  the  solution  y  has  the  form 
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y(t)  -  w(t)  +  p(t) 

where  w  is  periodic  and  p(t)  is  slowly  varying.   Define  the  period  of  y  to 
be  the  value  of  T  such  that 


T 
I(T)  =  min  j*  [y(t+T+s)  -  p(t+T+s)  -  y(t+s)  +  p(t+s)]  ds 

peP  0  (7) 

is  minimum  for  T  >   T«,  where  P  is  some  finite  dimensional  class  of 

functions  (e.g.  polynomials  of  degree  m).   If  the  assumption  on  y  is  valid, 

and  if  p  is  in  P,  then  I (T )  is  zero  for  T  equal  to  multiples  of  the  period 

of  w.   (We  constrain  T  _>  T_  to  avoid  the  undesirable  solution  T  *  0. ) 

Given  y  and  T,  the  computation  of  p  from  (7)  is  straightforward  because  it 

is  a  quadratic  minimization  problem  in  p.   Minimizing  with  respect  to  T  is 

more  difficult,  so  we  have  used  a  few  steps  of  an  iteration  to  find  a  zero 

of  its  derivative.   In  practice,  we  have  found  that  we  can  take  p  ■  0 

because,  for  problems  of  interest,  p  is  so  slowly  varying  compared  to  w  it 

can  be  ignored. 

The  algorithm  for  solving  y'  ■  f(y,t)  now  takes  the  following  general  form: 

1.  Start  with  a  small  H  and  a  one-step  outer  integration.   An  initial 
estimate  of  T  must  be  provided.   y_  is  given. 

2.  Evaluate  T  ,  g-  (see  "subroutine"  below).   Set 
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■  -  °«  zo  "  V  co  ■  °* 

3.   Predict  t   .,  z   .  using  outer  integration. 

A.  Evaluate  TN+1(zN+i)»  %+l  ^ZN+1  ^  ^see  subroutlne)  • 

5.  Correct  t„  . ,  z„  .  using  outer  integration. 

6.  Repeat  steps  4  and  5  as  desired. 

7.  Perform  a  final  evaluation  of  T   .  and  gN,,  if  desired. 

8.  Increment  N.   Select  new  step  and  order.   Repeat  steps  3  to  8  as 
necessary. 

Subroutine  to  evaluate  T  and  g. 

51.  Set  y  -  z 

52.  Integrate  y'  ■  f(y,t)  for  two  cycles  using  inner  integration 

53.  Compute  new  T  with  an  iteration  applied  to  I'(T)  ■  0  (see  eq. 
(7)) 

SA.   Compute  g  from  y 

Nonautononous  Systems 

If  the  problem  has  high-frequency  driving  terms,  then  it  is  important  to 
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keep  phase  information  about  the  solution.   Referring  to  Figure  3,  note 
that  the  dashed  line  is  not  a  part  of  the  solution  of  the  original  problem, 
but,  for  an  autonomous  system,  it  is  essentially  a  phase  shift  of  a  nearby 
part  of  the  solution.   However,  if  we  advance  t  by  multiples  of  the  period 
H  only,  we  will  remain  in  phase.   All  this  means  is  that  we  restrict  H  to 
integer  multiples  of  T.   We  call  this  the  synchronized  mode.   Fortunately, 
when  there  are  oscillatory  forcing  functions,  we  usually  know  the  period 
exactly,  so  it  is  not  difficult  to  stay  synchronized.   Tests  on  a  few 
simple  problems  reported  in  [12]  indicate  speed  improvements  of  twenty-  and 
forty-to-one  for  a  nonlinear  autonomous  problem  (lightly  damped  large 
amplitude  pendulum)  and  a  linear  nonautonomous  problem  (linear  oscillator 
with  high  frequency  forcing  function).   The  error  in  the  generalized 
methods  was  slightly  better  than  that  of  DIFSUB. 

Error  Analysis 

The  interested  reader  is  referred  to  Petzold's  thesis  [12]  for  details. 
Essentially,  the  analysis  shows  that  the  inner  integration  error  is  the 
critical  one.   It  is  easy  to  pick  parameters  so  that  the  contribution  from 
the  outer  integration  is  small.   The  contribution  from  the  inner 
integration  will  be  about  the  same  as  If  the  Inner  Integration  were  to  be 
used  over  the  full  range.   (This  should  not  be  viewed  lightly.   If  the 
oscillation  is  very  fast,  It  may  be  Impossible  to  solve  over  the  whole 
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ranpe  because  of  roundoff!   The  problem  can  be  seen  by  examining  g.   We  see 
from  eq.  (4)  that  as  T  >  0,  errors  in  computing  y  over  one  cycle  of  the 
inner  integration  grow  with  — . ) 

4   Conclusion 


There  are  many  problems  still  to  solve  before  the  universal  automatic 
integrator  can  be  written.   Among  these  are  the  development  of  reliable 
schemes  for  determining  the  nature  of  the  equation  and  solution  (stiff, 
oscillatory,  etc)  in  order  to  select  a  method.   Also  the  control  of  global 
error  in  an  efficient  manner  presents  a  challenge  that  is  likely  to  be 
frustrated  by  the  lack  of  theory.   The  asymptotic  dependence  of  the  global 
error  on  the  local  error  control  parameter  needs  to  expressed  in  a  form 
that  is  useful.   Unfortunately,  it  appears  that  with  most  popular  local 
error  control  schemes  it  does  not  have  a  useful  form.   Also,  there  is  a 
lack  of  theory  for  "large"  errors,  ones  for  which  asymptotic  analysis  is 
inappropriate.   In  practice,  we  compute  with  a  step  size  that  is  determined 
once  the  problem  and  error  control  are  specified.   Hence,  we  are  not 
interested  in  the  behaviour  of  the  error  as  the  step  decreases,  but  in  the 
actual  error  committed.   Ue  would  like  to  minimize  that  over  a  choice  of 
methods  (or  rather,  maximize  h  for  a  constant  error).   Thus,  we  are  not  the 
least  interested  in  the  order  of  the  method,  nor  its  convergence.   A  method 
with  zero  order  (and  hence  not  convergent)  might  be  the  best  method  for  the 
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particular  error  and  problem.  We  have  no  theory  to  build  on  these  ideas. 
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Figure  1.   Nearly  discontinuous  diode  current-voltage  relationship. 
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Figure  2.   Envelope  of  a  solution. 
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Figure  3.      Quasi   Envelope. 
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